ABSTRACT

We present the results of the GREAT08 Challenge, a blind analysis challenge to infer
weak gravitational lensing shear distortions from images. The primary goal was to
stimulate new ideas by presenting the problem to researchers outside the shear mea-
surement community. Six GREAT08 Team methods were presented at the launch of
the Challenge and five additional groups submitted results during the 6 month com-
petition. Participants analyzed 30 million simulated galaxies with a range in signal to
noise ratio, point-spread function ellipticity, galaxy size, and galaxy type. The large 
quantity of simulations allowed shear measurement methods to be assessed at a level 
of accuracy suitable for currently planned future cosmic shear observations for the first 
time. Different methods perform well in different parts of simulation parameter space 
and come close to the target level of accuracy in several of these. A number of fresh 
ideas have emerged as a result of the Challenge including a re-examination of the pro- 
cess of combining information from different galaxies, which reduces the dependence 
on realistic galaxy modelling. The image simulations will become increasingly sophis-
ticated in future GREAT challenges; meanwhile, the GREAT08 simulations remain as
a benchmark for additional developments in shear measurement algorithms. 

Key words: cosmology: observations - gravitational lensing - large-scale structure 
1 INTRODUCTION

A clump of matter induces a curvature in space-time which 
causes the trajectory of a light ray to appear bent. This ef- 
fect, known as gravitational lensing, is analogous to light 
passing through a sheet of glass of varying thickness such as 
a bathroom window. In both cases the light-emitting objects 
appear distorted. Making assumptions about the intrinsic 
(original) shapes of the emitting objects allows us to infer 
information about the intervening material. In cosmology 
we learn about the distribution of matter by studying the 
shapes of distant galaxies. In the vast majority of cases the 
distortion varies very little as a function of position on the 
galaxy image, and it can be approximated by a matrix dis- 
tortion. This regime is known as weak gravitational lensing, 
or cosmic shear when applied to large numbers of randomly 
selected distant galaxies. 

Gravitational attraction of ordinary matter and dark 
matter is expected to slow the expansion of the universe, 
causing the expansion to decelerate. However, multiple lines 
of evidence now show that the present day expansion of the 
Universe seems instead to be accelerating. The main ex- 
planations explored in the literature are that (i) Einstein's 
cosmological constant is non-zero, (ii) the vacuum energy is 
small but non-negligible, (iii) the Universe is filled with some 
new fluid, dubbed dark energy, or (iv) the laws of General
Relativity are wrong at large distances. Possibilities (i) and
(ii) can be subsumed within item (iii) because they look like
a dark energy fluid with equation of state $p = w \rho c^2$, where
$w = -1$. To find out more about the nature of dark energy
or modifications to the law of gravity we need high precision
measurements of the recent (z < 1) Universe.

By studying cosmic shear using galaxies at a range 
of different epochs we can learn how the dark mat- 
ter clumps as a function of time, which itself depends 
on the nature of dark energy and the laws of gravity. 
Cosmic shear appears to hold the most potential of all 
methods for investigating the dark energy or modifica-
tions to gravity (Albrecht et al. 2006; Peacock et al. 2006;
Albrecht & Bernstein 2007; Albrecht et al.).



There are many current, planned and proposed surveys
that will use cosmic shear to measure dark energy, includ-
ing the Canada-France-Hawaii Telescope Legacy Survey
(CFHTLS), the Kilo-Degree Survey (KIDS), the Panoramic
Survey Telescope and Rapid Response System (Pan-
STARRS), the Dark Energy Survey (DES), the Large
Synoptic Survey Telescope (LSST) and the space missions
Euclid and the Joint Dark Energy Mission (JDEM).

Cosmic shear was first detected just one decade ago
(Bacon et al. 2000; Kaiser et al. 2000; van Waerbeke et al.
2000; Wittman et al. 2000) and many studies have now
used it to measure cosmological parameters. Much work 
has also been carried out on anticipating any prob- 
lems that may limit the potential of cosmic shear over 
the coming decade. These are thought to be (i) accu-
racy of approximate methods for obtaining distances to
galaxies; (ii) intrinsic alignments of galaxies; (iii) accu- 
racy of numerical predictions of dark matter clustering 
on small scales and in the presence of baryons; and 
(iv) unbiased measurement of shear from galaxy images. 
There is now much discussion about obtaining high qual-
ity galaxy distances using spectroscopic redshifts to cal-
ibrate approximate methods to solve (i) (Ma et al. 2005;
Huterer et al. 2006a; Kitching et al. 2008; Bernstein & Ma
2008; Bernstein & Huterer 2009). The intrinsic alignment
signal (ii) can be removed if (i) can be solved perfectly
(Takada & White 2004; Joachimi & Schneider 2008) and
otherwise the two are closely linked (King & Schneider 2003;
Heymans & Heavens 2003; King 2005; Bridle & King 2007;
Zhang 2008; Bernstein 2009; Joachimi & Schneider 2009).
Supercomputers are being deployed to produce higher accu- 
racy predictions, and methods for suppressing information 
from the uncertain small-scale regime have been developed. 
In this paper we focus on the final problem, shear measure- 
ment from noisy images. It can be phrased entirely as a 
statistics problem of extracting information from images. 

In 2004 the Shear TEsting Programme (STEP) was 
launched to assess the current status of shear measurement 
methods. It began with a blind challenge set by and for
the weak lensing community (Heymans et al. 2006, hereafter
STEP1). A large volume of images containing a mixture of
stars and simple galaxies were produced. The participants 
had the task of extracting the (constant) input shear from 
the images, and these estimates were compared to the true 
input value. These end-to-end simulations showed that the 
shear measurement problem is far from trivial but that the 
methods in frequent use at that time were sufficiently accu- 
rate for the existing published cosmic shear measurements. 
Massey et al. (2007, hereafter STEP2) extended this work
with more sophisticated galaxy models, and built statistical 
devices into larger simulations to improve the measurement 
precision. This showed that, even considering realistic and 
more complex galaxy morphologies, existing methods were 
still sufficient for the current data. 

The cosmic shear community then began to look ahead 
to the coming decade of surveys and ask whether the ex- 
isting methods are sufficiently accurate even when the sta- 
tistical uncertainties are reduced by the massive increase in 
data quantity. Addressing this question requires much larger 
blind challenges, containing at least tens of millions of galax- 
ies. At the same time it was recognised that the shear esti- 
mation problem can be phrased as a statistics problem and 
that experts in image analysis from other disciplines may be 
in a position to contribute significantly to developing new 
approaches. Furthermore, it was decided that the strengths 
and weaknesses of different methods could be best assessed 
with slightly simpler simulations, in which various effects 
could be isolated. 

The previous two published blind shear analysis chal-
lenges (STEP1, STEP2) were slightly simplified relative to
real data in that the shear and the PSF did not vary across 
an image. However, they did ask participants to grapple with 
a number of difficult issues. 

• The images had relatively realistic PSFs with classical op- 
tical aberrations such as coma and trefoil. 

• Although the PSF did not vary across an image, partici- 
pants were asked not to use this fact. 
• STEP1 required participants to determine which objects
were stars and therefore could be used for a PSF determi- 
nation. 

• Both challenges required participants to run object detec-
tion software to determine where the stars and galaxies were.
Spuriously detected objects could and did affect the shear. 

• Galaxies were drawn from a range of magnitudes, so that 
weighting schemes as a function of the Signal-to-Noise Ratio 
(SNR) were important. 

• Galaxies were randomly placed, so that sometimes they 
overlapped. Participants were responsible for either deblend- 
ing or rejecting these galaxies. 

The GREAT08 Challenge removes all of these issues to focus 
on the core problem of inferring shear given a PSF and stan- 
dardised set of non-overlapping galaxies at (approximately) 
known positions. The motivation is that once this problem 
is solved, the other issues will be introduced in further chal- 
lenges of increasing complexity. 

The Gravitational LEnsing Accuracy Testing 2008
(GREAT08) Challenge Handbook (Bridle et al. 2009, here-
after the GREAT08 Handbook) describes the shear mea-
surement problem for non-cosmologists and sets out the
challenge. GREAT08 was launched in October 2008 and
ran as a blind competition for 6 months until the end of
April 2009. This paper describes the results of GREAT08.
Section 2 describes the GREAT08 simulations. We review
the shear measurement problem and shear accuracy require-
ments in Section 3. Section 4 summarises current shear mea-
surement methods and Section 5 presents the Challenge re-
sults. We conclude and give an overview of the potential for
future GREAT Challenges in Section 6. We provide extra
details of the simulations, methods and results in appendices.



2 THE GREAT08 SIMULATIONS 

The GREAT08 images are provided in sets of 10,000 objects
in a single FITS file. Each object is generated on its own grid
of 39 x 39 pixels and these postage stamps are patched to-
gether for convenience in a 100 x 100 layout, with a 1 pixel
border, so each set is a patchwork image of 4000 x 4000
pixels. Each galaxy postage stamp is generated using the
following sequence: (i) simulate a galaxy model; (ii) con-
volve it with a kernel, referred to as the point-spread func-
tion (PSF); (iii) bin up the light in pixels; and (iv) apply the
noise model. The PSFs used are given in Appendix A1. Each
postage stamp is produced using a list of parameters spec-
ifying the individual object and simulation properties. We
describe the catalogues of these properties in Appendix A2.
The method used to produce images from the catalogues
is overviewed below and described in more detail in Ap-
pendix A3. Example images are shown in Fig. 1.
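
The patchwork layout described above can be turned into index arithmetic. The following is a minimal sketch, not the official GREAT08 reader: it assumes that objects are stored in row-major order and that the 1 pixel border follows each stamp (set `border_first=True` if it precedes it). The function name `stamp_slices` is hypothetical.

```python
STAMP = 39          # side length of one postage stamp in pixels (from the text)
PITCH = STAMP + 1   # stamp plus its 1 pixel border
GRID = 100          # stamps per row and per column of the patchwork


def stamp_slices(index, border_first=False):
    """Return (row_slice, col_slice) locating object `index` (0..9999).

    Assumptions (not stated in the text): row-major ordering of objects,
    and the border pixel placed after each stamp unless border_first=True.
    """
    if not 0 <= index < GRID * GRID:
        raise ValueError("index must lie in 0..9999")
    row, col = divmod(index, GRID)
    offset = 1 if border_first else 0
    r0 = row * PITCH + offset
    c0 = col * PITCH + offset
    return slice(r0, r0 + STAMP), slice(c0, c0 + STAMP)
```

With a 4000 x 4000 array `image` already loaded from a FITS file, `image[stamp_slices(42)]` would then give the 39 x 39 cut-out of object 42 under these assumptions.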

Four different groups of galaxy images were pro-
vided in GREAT08: (i) low noise galaxy images for
which the true shears were provided during the Chal-
lenge, labelled LowNoise_Known; (ii) low noise galaxy im-
ages for which there was a blind challenge to extract the
true shears, labelled LowNoise_Blind; (iii) realistic noise
galaxy images for which the true shears were provided,
labelled RealNoise_Known; and (iv) realistic noise galaxy
images with blind shear values, RealNoise_Blind. This
RealNoise_Blind group formed the main GREAT08 Chal-
lenge. These are described in more detail in the GREAT08
Handbook, together with the rules governing which infor-
mation could be used to inform the blind challenges.

Table 1. Parameters for the LowNoise_Known simulations.
Rgp/Rp is the ratio of PSF convolved galaxy Full Width at Half
Maximum (FWHM) to the PSF FWHM. 'b or d' describes the
fact that 50% of the galaxies in each set have de Vaucouleurs
profiles (bulge only) and 50% have exponential profiles (disk
only). The parameters for LowNoise_Blind are the same except
the galaxies are a mix of the two components as described in the
text. The parameters for RealNoise_Known are the same as for
LowNoise_Known except the SNR is 20.

              Fiducial   Lower value   Upper value
SNR           200        N/A           N/A
Rgp/Rp        1.4        1.22          1.6
PSF type      Fid        N/A           N/A
Galaxy type   b or d     N/A           N/A

Table 2. Parameters for the RealNoise_Blind simulations. The
PSF models and other parameters are defined in detail in Appen-
dices A1 and A2.

              Fiducial   Lower value   Upper value
SNR           20         10            40
Rgp/Rp        1.4        1.22          1.6
PSF type      Fid        Fid rotated   Fid e x 2
Galaxy type   b+d        b or d        b+d offcenter


The parameters for each set in LowNoise_Known were
determined using the upper panel of Fig. 2 and Table 1.
There are 15 sets (FITS images) each containing 10,000
galaxies. There are 5 sets with each of 3 different galaxy
size values. The method for setting the galaxy sizes and
SNR values is described in Appendices A2 and A3.

The parameters for each set in RealNoise_Blind were de-
termined using the lower panel of Fig. 2 and Table 2. There is
a range in SNR, galaxy size, PSF ellipticity and galaxy type.
One branch of the RealNoise_Blind holds all parameters at 
their fiducial values. Each of the 4 variable parameters has 
a 'lower' and an 'upper' value relative to the fiducial. When 
each of these values is used all other parameters are fixed at 
the fiducial values. This makes 9 different branches in total. 
In each branch there are 6 realisations of each of 50 different 
shear values, making 2700 sets with 10,000 galaxies in each. 
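
The branch bookkeeping in the paragraph above can be checked with a few lines of arithmetic. This is only an illustrative sketch; the branch labels are hypothetical names chosen here, not identifiers from the Challenge data.

```python
# One fiducial branch plus a lower and an upper branch for each varied parameter.
PARAMETERS = ["SNR", "galaxy size", "PSF type", "galaxy type"]
branches = ["fiducial"] + [
    f"{name} {side}" for name in PARAMETERS for side in ("lower", "upper")
]

SHEARS_PER_BRANCH = 50      # different shear values per branch
REALISATIONS = 6            # realisations of each shear value
GALAXIES_PER_SET = 10_000   # galaxies in each FITS image

n_sets = len(branches) * SHEARS_PER_BRANCH * REALISATIONS
n_galaxies = n_sets * GALAXIES_PER_SET
```

Here `n_sets` evaluates to 2700 and `n_galaxies` to 27 million for RealNoise_Blind alone, which accounts for most of the roughly 30 million galaxies quoted in the abstract.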

Images are generated by sampling from the galaxy light 
distribution, sampling from the PSF, adding the sample po- 
sitions to simulate convolution, binning the samples onto a 
pixel grid, and then applying the noise model. The exact 
numerical techniques used are detailed in Appendix A3. In
brief, samples are first generated from the circular galaxy
profile. Next, they are stretched to have the required ellip-
ticity and then sheared. Samples are then drawn from the
circular PSF distribution and made elliptical using the shear
distortion equations given in Appendix A3. Each galaxy
sample is added to a PSF sample to simulate convolution, 
and finally the samples are binned into pixels. 
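
The sampling sequence above can be sketched as follows. This is an illustration under stated assumptions, not the GREAT08 simulation code: circular Gaussians stand in for the galaxy and PSF profiles, a simple linear map stands in for the distortion equations of Appendix A3, and the final noise model step is omitted. The names `shear_points` and `simulate_stamp` are hypothetical.

```python
import numpy as np


def shear_points(xy, g1, g2):
    """Apply a linear shear-like distortion to an (N, 2) array of (x, y) positions.

    Assumed form: x' = (1 + g1) x + g2 y, y' = g2 x + (1 - g1) y.
    """
    x, y = xy[:, 0], xy[:, 1]
    return np.column_stack(((1 + g1) * x + g2 * y, g2 * x + (1 - g1) * y))


def simulate_stamp(n_samples, e_gal, shear, e_psf,
                   sigma_gal=2.0, sigma_psf=2.0, size=39, rng=None):
    """Build one noiseless postage stamp by sampling positions."""
    rng = np.random.default_rng() if rng is None else rng
    # Samples from a circular galaxy profile (Gaussian here, an assumption).
    gal = rng.normal(0.0, sigma_gal, size=(n_samples, 2))
    # Stretch to the intrinsic ellipticity, then apply the lensing shear.
    gal = shear_points(gal, *e_gal)
    gal = shear_points(gal, *shear)
    # Samples from a circular PSF, made elliptical in the same way.
    psf = rng.normal(0.0, sigma_psf, size=(n_samples, 2))
    psf = shear_points(psf, *e_psf)
    # Adding a PSF offset to each galaxy sample simulates the convolution.
    xy = gal + psf
    # Bin onto a pixel grid centred on the stamp (rows are y, columns are x).
    half = size / 2.0
    image, _, _ = np.histogram2d(xy[:, 1], xy[:, 0], bins=size,
                                 range=[[-half, half], [-half, half]])
    return image
```

A call such as `simulate_stamp(100000, (0.0, 0.0), (0.1, 0.0), (0.0, 0.0))` returns a 39 x 39 array of counts that is elongated along x.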
Figure 1. Left: The first galaxy of the first LowNoise_Known FITS image. Right: The first galaxy of the first RealNoise_Known FITS
image. The signal is a factor of ten smaller for the RealNoise images than the LowNoise images, making the problem much more 
challenging. 



3 FIGURE OF MERIT 

The shear measurement problem was summarised for non-
cosmologists in the GREAT08 Handbook. In short, light
from a source galaxy is sheared and (slightly) magnified by 
passing through a gravitational potential on its way to the 
observer; the observable anisotropic stretching is called the 
reduced shear g, which is a pseudovector with two compo- 
nents. (Because the distinction between shear and reduced 
shear is not important in the context of this paper, which is 
aimed at both the astronomical and statistical communities, 
we refer to g as simply "shear" for convenience.) 

Shear measurements are confounded by several un- 
avoidable observational effects. First, for ground-based tele- 
scopes, when the light passes through the atmosphere it is 
convolved with a kernel that must be inferred from the data. 
Second, telescope optics (whether in space or on the ground) 
also cause the image to be convolved with a kernel; this ker- 
nel may be more predictable than the atmospheric kernel 
because the optics may be well modeled. In any case, the ef- 
fective kernel imposed by atmosphere and optics is referred 
to as the point-spread function (PSF). Third, emission from 
the sky causes a roughly constant "background" level to be 
added to the whole image. Fourth, the detectors sum the 
light falling in each pixel, effectively convolving the image 
with a square tophat window function, and sampling the 
resulting image at the center of each pixel. This extra con- 
volution effect is treated by some authors as part of the PSF. 
Fifth, the finite number of photons collected in a given pixel 
is subject to Poisson noise (in addition the final detector 
readout adds Gaussian noise of zero mean, but this is ig- 
nored in GREAT08). 

Thus a successful method must both filter the noise 
effectively and remove the significant PSF convolution ker- 
nel in the observed galaxy image. To represent a method's 
ability to perform both tasks in a single number for the 
GREAT08 Challenge, we define a quality metric 



    Q = \frac{10^{-4}}{\left\langle \left( \left\langle g^m_{ij} - g^t_{ij} \right\rangle_{j \in k} \right)^2 \right\rangle_{ikl}}    (1)

where $g^m_{ij}$ is the ith component of the measured shear for
simulation j, $g^t_{ij}$ is the corresponding true shear compo-
nent, the inner angle brackets denote an average over sets
with similar shear value and observing conditions $j \in k$, and
the outer angle brackets denote an average over simulations
with different true shears k, observing conditions l and shear
components i.

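In our detailed discussion of the results below we also
define a Q value for each simulation branch. In this case the
average over different observing conditions l is omitted,

    Q_l = \frac{10^{-4}}{\left\langle \left( \left\langle g^m_{ij} - g^t_{ij} \right\rangle_{j \in k} \right)^2 \right\rangle_{ik}}    (2)

therefore

    Q = \frac{1}{\left\langle Q_l^{-1} \right\rangle_l}.    (3)
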



This definition has the effect of strongly penalising methods
that perform poorly in any single simulation branch, which 
is useful because the simulation branches are all chosen to 
be realistic scenarios in which we need to be able to measure 
good shears. For a method to be used for all future analyses 
it must work well on all branches of the simulations. In par- 
ticular, there are many small and low SNR galaxies that we 
would like to use for cosmic shear cosmology. However, the 
purpose of this results paper is to examine the performance 
of the different methods on the different branches in detail 
rather than relying on a single number Q to differentiate 
between methods. 
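
To see how strongly one poor branch is penalised, the overall Q can be read as the harmonic mean of the per-branch values, Q = 1 / <1/Q_l>_l. The sketch below assumes equal weighting of branches; the function name `combine_q` is hypothetical.

```python
def combine_q(branch_q):
    """Combine per-branch Q_l values into an overall Q = 1 / <1 / Q_l>_l."""
    if not branch_q or min(branch_q) <= 0:
        raise ValueError("need at least one branch and strictly positive Q values")
    # Harmonic mean: the smallest Q_l dominates the sum of reciprocals.
    return len(branch_q) / sum(1.0 / q for q in branch_q)


# Eight branches that reach the target and one that performs poorly.
example = combine_q([1000.0] * 8 + [10.0])
```

In this example eight branches with Q_l = 1000 and a single branch with Q_l = 10 combine to an overall Q of about 83, far below the arithmetic mean of 890.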

To set this metric in context, if a single constant value of
zero shear were submitted ($g^m_{ij} = 0$ for all j) then, since
the rms true shear $\sqrt{\langle (g^t_{ij})^2 \rangle_{ij}} \approx 0.03$,
Q would have a value of about 0.1. To date, methods tested
in STEP1 and STEP2 and used on real data have Q ~ 10 to
Q ~ 100 (Kitching et al. 2008), which is sufficient for the
surveys on which they were employed but not sufficient for
mid-term to far future surveys.
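
The quality metric and the two reference values quoted in this section can be reproduced with a short calculation. This is a minimal sketch rather than the official scoring code: it assumes that every group of sets carries equal weight in the outer average, and that shear components, shear values and branches are all encoded in a single group label. The name `quality_factor` is hypothetical.

```python
from collections import defaultdict


def quality_factor(measured, true, group):
    """Q = 1e-4 / < ( <g_m - g_t>_{j in k} )^2 >.

    measured, true: shear component values, one entry per simulated set.
    group: one hashable label per entry; residuals sharing a label are
           averaged first (inner brackets), then the squared means are
           averaged over labels (outer brackets).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for g_m, g_t, label in zip(measured, true, group):
        sums[label] += g_m - g_t
        counts[label] += 1
    mean_square = sum((sums[k] / counts[k]) ** 2 for k in sums) / len(sums)
    return 1e-4 / mean_square


true_shear = [0.03, -0.03, 0.03, -0.03]   # rms true shear of 0.03
labels = [0, 1, 2, 3]                     # each set is its own group here

q_zero = quality_factor([0.0] * 4, true_shear, labels)
q_additive = quality_factor([g + 0.0003 for g in true_shear], true_shear, labels)
```

Submitting zeros gives `q_zero` close to 0.11, while a pure additive error of 0.0003 gives `q_additive` close to 1100.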



Amara & Refregier (2008) show that a deep full-sky
(e.g. Euclid-like) survey requires that the additive error c <
0.0003 and the multiplicative error m < 0.001. For a pure
additive error this translates to a requirement that Q > 1000
and we set this as our target for GREAT08 because additive
errors are much more difficult to self-calibrate using pairs of
tomographic redshift bins (Huterer et al. 2006b) (see also
Van Waerbeke et al. 2006). A detailed analysis of the two
separate terms is given in Appendix C.

Figure 2. Upper panel: Schematic of the galaxy parameters used in LowNoise_Blind. Each realisation corresponds to a different set
or FITS image file containing 10,000 galaxies. The schematic looks identical for LowNoise_Known. For RealNoise_Known there are 100
shears per branch in place of 5. The bottom row of boxes represents galaxies with the same properties as the penultimate row of boxes,
but rotated by 90 degrees. Lower panel: Schematic of the galaxy parameters used in RealNoise_Blind.

As defined, Q penalises deviations from truth regardless 
of whether they are random or systematic. This is useful for 
selecting a winner, but much can be learned by separating 
errors into random and systematic parts. For the system-
atic part we follow STEP1 and STEP2 by defining a mul-
tiplicative error $m_i$ and an additive error $c_i$ as the best-fit
parameters to

    g^m_{ij} - g^t_{ij} = m_i g^t_{ij} + c_i    (4)

We show some results for the average of the two components
$m = \langle m_i \rangle_i$, $c = \langle c_i \rangle_i$. For a given method, changes in m and
c across simulation branches may indicate the strengths and 
weaknesses of the method. 

Participants may optionally submit uncertainty esti- 
mates on their shears. These are compared to the residu- 
als of the submitted shears over sets of simulations with 
nearly identical true shear values. If the uncertainty esti- 
mates are wrong by more than a factor of two, the sub- 
mission is flagged as such, but is not penalised. The main 
purpose of GREAT08 is to produce a high Q value rather 
than yield correct uncertainty estimates. 

A method is not useful if it obtains very small shear 
biases at the expense of throwing away most of the informa- 
tion and thus very noisy shear estimates. The quality factor 
Q will be worse if a method has very noisy shear estimates 
because the rms difference between the truth and submission 
will be non-negligible even if the biases are zero. We there- 
fore calculate the scatter of the submitted shear values about 
the best linear fit to the true shears. Specifically, we plot sub-
mitted $g_1$ values as a function of true $g_1$, with one point for
each FITS file, and fit the straight line described above. We
find the rms residual to obtain the scatter $\sigma_1$ in the first
component $g_1$. We repeat for $g_2$ and write $\sigma = \langle \sigma_i \rangle_i$, aver-
aging over the two shear components i. See Kitching et al.
(2008) for additional discussion.
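
The decomposition into multiplicative bias, additive bias and scatter can be illustrated for a single shear component with an ordinary least-squares line fit. This is a sketch under the assumption of an unweighted fit with one point per FITS file; the name `fit_bias` is hypothetical.

```python
import math


def fit_bias(g_true, g_meas):
    """Fit g_meas - g_true = m * g_true + c and return (m, c, scatter).

    The scatter is the rms residual of the points about the best-fit line.
    """
    n = len(g_true)
    resid = [gm - gt for gm, gt in zip(g_meas, g_true)]
    mean_t = sum(g_true) / n
    mean_r = sum(resid) / n
    sxx = sum((gt - mean_t) ** 2 for gt in g_true)
    sxy = sum((gt - mean_t) * (r - mean_r) for gt, r in zip(g_true, resid))
    m = sxy / sxx
    c = mean_r - m * mean_t
    scatter = math.sqrt(
        sum((r - (m * gt + c)) ** 2 for gt, r in zip(g_true, resid)) / n
    )
    return m, c, scatter
```

Running the function once per component and averaging the results gives the m, c and scatter values averaged over the two shear components.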



4 METHODS 

In this section we briefly summarise the algorithm used by
each submitting group. Table 3 lists the participants, their
methods, and the corresponding identifiers used in subse-
quent tables and in the figure legends. Methods marked with
a dagger are GREAT08 Team entries; these participants
had access to the internal details of the GREAT08 Chal-
lenge simulations, but they did not consciously use this in-
formation in their analyses. Entries from PG and MV had some
overlap with the GREAT08 Team. Not all submitting groups
submitted results for both types of Blind simulation. An ad-
ditional table (Table B1) in Appendix B gives further infor-
mation including URLs where more information can be found.

For a quick overview we attempt to summarise each
method with just three action steps in Table 3. We see that
a key differentiating factor is the stage at which an average
is performed over galaxies in the image. We refer to HB, AL
and USQM as "stacking" methods hereafter. The two different
routes are illustrated in Fig. 3.

Figure 3. Illustration of the different routes to a combined shear
statistic from multiple galaxies. The lower left route is the tradi-
tional approach in which each galaxy image is analysed separately
to produce a shear estimate. The upper right route illustrates the
"stacking" methods which average some statistic of each image
and perform shear estimation on the averaged statistic.

STEP2 classified methods according to their methods
for PSF correction and construction of a shear estimator.
PSF "deconvolution" methods convolve a model with the 
PSF before fitting as indicated by "* PSF" in the table; 
PSF "subtraction" methods subtract a contribution due to 
the size and ellipticity of the PSF. "Active" shear measure- 
ment methods sheared a "circular" galaxy model until it best 
matched the data, generally indicated by the word "fit" in 
the action list; "passive" methods constructed a shear es- 
timator from a combination of shape statistics and an es- 
timate of how these would further change under a shear. 
This classification system proved insufficient to capture the 
more varied behaviour of methods containing new ideas in 
GREAT08. We next summarise each method in turn, in or- 
der of decreasing Q value on RealNoise_Blind. 

HB: The magnitude of the Fourier transform of the 
galaxy image raised to an arbitrary power is a character- 
istic feature of the individual galaxies. This feature is in- 
dependent of the spatial location of the galaxy center to a 
high precision, provided that the smoothed galaxy intensity 
decays sufficiently fast towards the edge of the image. No 
other assumptions are necessary. Because the galaxy images 
are contaminated by Poisson noise, an unbiased estimator of 
the power spectrum is given by the power spectrum of the 
noisy image minus a constant. The resulting image obtained 
by averaging over the unbiased estimators of the individual 
galaxy power spectra is an elliptically contoured function 
multiplied by the power spectrum of the convolution ker- 
nel plus Gaussian noise. After suitable normalization, the 
square root of the covariance matrix of the elliptically con- 
toured function is equal to the shear coordinate transfor- 
mation matrix. For parameter fitting, HB used a weighted
non-linear least squares method for which the weights are
equal to the inverse of the standard deviation of the noise.
For more information see Hosseini & Bethge (2009).
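
The first two action steps of a power-spectrum stacking approach of this kind can be sketched as below. This is not the HB code: it only illustrates estimating a noise-corrected power spectrum per stamp and averaging over stamps, and the subsequent model fit is omitted. The constant `noise_power` is an assumed input; with NumPy's unnormalised FFT it would be the number of pixels times the per-pixel noise variance for white noise, or the total expected counts for Poisson noise. The name `stacked_power_spectrum` is hypothetical.

```python
import numpy as np


def stacked_power_spectrum(stamps, noise_power):
    """Average the noise-corrected power spectra of a set of postage stamps.

    stamps: array-like of shape (n_stamps, ny, nx).
    noise_power: constant expected noise contribution to each Fourier mode.
    """
    stamps = np.asarray(stamps, dtype=float)
    # Squared magnitude of the Fourier transform of every stamp; this is
    # unchanged when the galaxy is (circularly) shifted within its stamp.
    power = np.abs(np.fft.fft2(stamps, axes=(-2, -1))) ** 2
    # Subtract the constant noise term, then average over galaxies.
    return (power - noise_power).mean(axis=0)
```

Because the power spectrum discards the Fourier phases, no centroid estimate is needed before averaging, which is the property emphasised in the description above.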
Participant(s)                Key    Action 1                             Action 2                    Action 3
Hosseini, Bethge              HB     Estimate power spectrum              Average power spectra       Fit elliptical model * PSF
Lewis                         AL     Estimate centroids                   Average images              Fit elliptical model * PSF
Kitching                      TK†    Fit elliptical model * PSF           Combine ellipticity PDFs    Calculate shear
Heymans                       CH†    Measure weighted quadrupole moments  Correct for weight and PSF  Average shear estimates
Paulin, Gentile               PG     Fit elliptical model * PSF           -                           Average shear estimates
Velander                      MV     Fit flexed elliptical model * PSF    -                           Average shear estimates
Kuijken                       KK†    Fit elliptical model * PSF           -                           Average shear estimates
Harmeling, Hirsch, Scholkopf  HHS3   Estimate centroids                   Average good images         Fit elliptical model * PSF
Bridle                        SB†    Fit elliptical model * PSF           -                           -
Harmeling, Hirsch, Scholkopf  HHS2   Estimate centroids                   Average images              Fit elliptical model * PSF
Harmeling, Hirsch, Scholkopf  HHS1   Fit elliptical Gaussian              Correct for model and PSF   Average shear estimates
Jarvis                        MJ†    Fit "elliptical" model * PSF         -                           Average shear estimates
Bridle, Schrabback            USQM†  Measure quadrupole moments - PSF     Average quadrupole moments  Calculate shear

Table 3. Table of participants, figure legend identifiers and pseudo-code which attempts to summarise the main actions carried out in
each method. "* PSF" indicates that a PSF convolved model was fitted. "PDF" stands for probability density function. Daggers after
the Key indicate GREAT08 Team entries. More information is provided in the main text and in Appendix B.



AL: This method was inspired by Kuijken (1999) and
is described in Lewis (2009). Centroids for each galaxy are
determined and all galaxies in a FITS image are stacked
on a sub-pixel scale. A PSF convolved elliptical profile is
fitted to this stacked image, and the ellipticity corresponds
to the shear. As pointed out in Lewis (2009), the advantage
of this approach is that the individual non-elliptical shapes
of individual galaxies are averaged out. This fact was taken
advantage of in HB, HHS2 and HHS3.
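
Stacking on a sub-pixel scale can be sketched with a first-moment centroid and a Fourier-domain shift. This is an illustration only: the centroid estimator and interpolation scheme actually used by AL are not specified here, a plain first moment is noisy on realistic data, and the function names are hypothetical.

```python
import numpy as np


def centroid(image):
    """Flux-weighted first moment (row, column) of an image."""
    rows, cols = np.indices(image.shape)
    total = image.sum()
    return (rows * image).sum() / total, (cols * image).sum() / total


def fourier_shift(image, dy, dx):
    """Shift image content by (dy, dx) pixels using the Fourier shift theorem."""
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    phase = np.exp(-2j * np.pi * (fy * dy + fx * dx))
    return np.fft.ifft2(np.fft.fft2(image) * phase).real


def stack_on_centroids(stamps):
    """Recentre every stamp on the stamp centre and average the results."""
    stamps = [np.asarray(s, dtype=float) for s in stamps]
    target_r = (stamps[0].shape[0] - 1) / 2.0
    target_c = (stamps[0].shape[1] - 1) / 2.0
    shifted = []
    for stamp in stamps:
        r, c = centroid(stamp)
        shifted.append(fourier_shift(stamp, target_r - r, target_c - c))
    return np.mean(shifted, axis=0)
```

A PSF convolved elliptical profile would then be fitted to the array returned by `stack_on_centroids`; that fitting step is not shown.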

TK: The Lensfit code fits a sum of co-elliptical exponen- 
tial and de Vaucouleurs models to each individual galaxy 
and the best fit ellipticity is found. The bulge (de Vau- 
couleurs component) to disk (exponential component) frac- 
tion is a free parameter in the fit. The shear is calculated 
using a Bayesian estimator. For more details see Appendix
F of the GREAT08 Handbook and also Miller et al. (2007)
and Kitching et al. (2008). The version used here differs from
the previously published implementations by including sub- 
pixel estimation of galaxy positions and adaptive ellipticity 
grid refinement. 

CH: An implementation of the longstanding KSB
(Kaiser et al. 1995) method, which is the most widely used
code on observational data. For more information, see Ap- 
pendix C of the GREAT08 Handbook. 

PG: For each galaxy, a 6-parameter Sersic model is con-
volved with the PSF and pixellated. This is fitted to the im-
age through χ² minimization using the Levenberg-Marquardt
gradient-expansion algorithm. The six fitted parame-
ters are: the centroid (2 parameters), the magnitude, the
size, and the ellipticity (2 parameters). The estimated shear
of an individual galaxy is derived from its fitted parame-
ters and the averaged shear over a number of galaxies is the
average of individual shears.

MV: This method is an extension of the KK method de-
scribed below. It is being developed with the aim of measur-
ing higher order galaxy image distortions, known as flexion,
as well as shear. These higher order distortions add impor-
tant detail to the measurement of galaxy halo density pro-
files and to dark matter mapping. For more information on
this method see Velander & Kuijken (in prep.), and for further
detail on flexion see Bacon et al. (2006).

KK: Each individual galaxy is modelled as a sheared, 
circular source described by means of the first-order shear 
operators in shapelet space. The PSF is also modelled as a 
high-order shapelet expansion, and all convolutions are car-
ried out in shapelet space using the prescriptions in Refregier
(2003). For further information see Kuijken (2006) and Ap-
pendix D of the GREAT08 Handbook.

HHS1/HHS2/HHS3: In HHS1 an elliptical Gaussian is
fitted to each galaxy image by minimizing the mean-squared 
error via gradient descent in the 6 model parameters. As 
in SB, the average ellipticity is taken as an estimate for 
the shear. Due to the simplified galaxy model and the PSF 
blur a systematic bias is introduced, which is corrected for 
by off-setting the ellipticity values and via calibration using 
the training data. The methods HHS2 and HHS3 aim to be 
more robust by adopting the idea of AL to stack all galaxy 
images within one FITS file on a subpixel scale in order to 
increase the SNR. In addition, in HHS3 corrupted images 
were removed before stacking. 

SB: The im2shape code models each individual galaxy
as a sum of co-elliptical Gaussians. The parameters are
marginalised using MCMC sampling and the mean elliptic-
ity of the samples is taken to correspond to the shear. For
computational speed, only 16 x 16 pixels in the center of each
postage stamp were used in the fit. See Appendix E of the
GREAT08 Handbook and Bridle et al. (2002).

MJ: This algorithm seeks a coordinate system in which 
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USQM 



1.22 1.4 1.6 

R /R 
gp p 

Figure 4. Our figure of merit Q as a function of galaxy size for 
LowNoise_Blind. 



Rank  ID    Method                      Q
1     HHS1  Gauss                       488
2     AL    KK99                        375
3     PG    gfit                        136
4     TK    Lensfit                     33.7
5     CH    KSBf90                      32.4
6     MV    KKshapelets with flexion    21.2
7     MJ    BJ02 deconvolved shapelets  20.2
8     KK    KKshapelets                 19.7
9     SB    im2shape                    15.3
10    USQM  USQM                        1.84



a model of the galaxy is found to be round. The model 
is convolved by the PSF and then compared to the ob- 
served pixel intensities. A shapelet decomposition is used for 
the underlying model, and roundness is defined as the sec- 
ond order shapelet coefficients being 0. Then the shear that 
brings this coordinate system back to the actual observation 
is assigned as the shape of the galaxy. For more information
see Bernstein & Jarvis (2002), Nakajima & Bernstein
(2007) and Appendix D of the GREAT08 Handbook.

USQM: This is a very simple method, not actually used 
in practice, but provided as a baseline comparison. The un- 
weighted quadrupole moments of each galaxy are calculated 
within a square aperture of 20 pixels by 20 pixels. These 
are averaged (stacked) over all galaxies in each FITS image 
and the PSF is removed by subtracting the PSF quadrupole 
moments. See Appendix B of the GREAT08 Handbook for 
more information. 
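
The USQM recipe can be sketched in a few lines. This is a minimal illustration and not the GREAT08 Team code: the function names, the use of the flux-weighted centroid, and the ellipticity combination (Qxx - Qyy, 2 Qxy)/(Qxx + Qyy) are assumptions made here for concreteness. Subtracting the PSF moments relies on the fact that unweighted second moments add under convolution.

```python
def quadrupole_moments(image):
    """Unweighted second moments about the flux-weighted centroid.

    `image` is a list of rows, indexed as image[y][x].
    Returns (Qxx, Qyy, Qxy), each normalised by the total flux.
    """
    total = sum(sum(row) for row in image)
    xc = sum(x * v for row in image for x, v in enumerate(row)) / total
    yc = sum(y * v for y, row in enumerate(image) for v in row) / total
    qxx = qyy = qxy = 0.0
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            dx, dy = x - xc, y - yc
            qxx += v * dx * dx
            qyy += v * dy * dy
            qxy += v * dx * dy
    return qxx / total, qyy / total, qxy / total


def mean_moments(moment_list):
    """Average (stack) the moments of many galaxies."""
    n = len(moment_list)
    return tuple(sum(q[i] for q in moment_list) / n for i in range(3))


def usqm_ellipticity(gal_q, psf_q):
    """Subtract the PSF moments, then form an ellipticity (assumed definition)."""
    qxx, qyy, qxy = (g - p for g, p in zip(gal_q, psf_q))
    denom = qxx + qyy
    return (qxx - qyy) / denom, 2.0 * qxy / denom
```

In use, `quadrupole_moments` would be called on each 20 x 20 pixel cut-out, the results passed through `mean_moments`, and the PSF moments removed once with `usqm_ellipticity`.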

In terms of the nomenclature introduced in STEP2 most
of the methods forward fit an elliptical PSF convolved model
("active", "deconvolution"). This is in contrast to the
situation in STEP1 and STEP2 where the majority of the
methods were "passive" PSF subtraction methods. There were
no stacking methods in STEP1 or STEP2.



5 RESULTS 

There were two blind challenges: LowNoise_Blind contains 
high SNR images and RealNoise_Blind contains images with 
a realistic noise level. The GREAT08 Challenge prize for 
highest Q value is based on the RealNoise_Blind results.
The LowNoise_Blind competition contained significantly less 
data and should have been an easier challenge. Further- 
more, the galaxy properties in LowNoise_Blind were simi- 
lar to those in RealNoise_Blind and are mostly co-centered 
bulge plus disk models. It could therefore have been useful to 
optimise some properties of methods on the LowNoise_Blind 
images in preparation for RealNoise_Blind. First, we exam- 
ine the LowNoise_Blind results. 

5.1 LowNoise Blind Results 

Table 4 shows the LowNoise_Blind leaderboard at the close
of the challenge. The winner in LowNoise_Blind is the Gauss
method of S. Harmeling, M. Hirsch, and B. Schölkopf. The
top three methods in LowNoise_Blind are not GREAT08
Team methods. Note that HB did not submit a result for
LowNoise_Blind.

Table 4. LowNoise_Blind leaderboard at the close of the
challenge. See Table 3 and Section 4 for more information about
each method.

Fig. 4 shows our shear measurement figure of merit
Q as a function of the ratio between the convolved galaxy
size and the PSF size, Rgp/Rp. Since the number of galaxies
decreases steeply as a function of galaxy size in real
data, it is desirable to have a shear measurement method
that allows the use of small galaxies. It is often assumed
that shear measurement biases are larger for small galaxies.
There are some examples where this is true in STEP2
Fig. 7, and Nakajima & Bernstein (2007) Fig. 5. However,
the shear biases are caused by a combination of two effects:
a poorly measured PSF and inherent biases that exist even
if the PSF is perfectly known. It is expected that an
incorrect PSF model will affect small galaxies the most, since for
the largest galaxies the PSF has little effect (e.g. Eq. 13 of
Paulin-Henriksson et al. 2008). In GREAT08 the exact PSF
equation is known and if this information is properly used
then the results will tell us about the inherent biases, for
which expectations are less clear.

HHS1 (dashed magenta line in Fig. 4) is the clear winner
overall in LowNoise_Blind and wins at both the fiducial
and small galaxy sizes. The implementation of KSB by CH
(solid green line in Fig. 4) provided the best performance
for highly resolved galaxies. As discussed above, this general
trend of increasing Q with increasing galaxy size was
expected, and is followed by many methods. The winning
method HHS1, however, performed worse as the galaxy size
increased for LowNoise_Blind. We suggest that the method for
calibrating the ellipticities for the PSF blurring was less reliable
at large galaxy sizes because the large elliptical
galaxies sometimes extend beyond the 39 x 39 pixel postage
stamp.

Further analysis of the LowNoise_Blind results in terms 
of multiplicative and additive shear calibration biases can 
be found in Appendix C1.

Figure 5. Shear measurement figure of merit Q as a function of simulation properties for RealNoise_Blind.



Rank  Author  Method                         Q
1     HB      CVN Fourier                    211
2     AL      KK99                           131
3     TK      Lensfit                        119
4     CH      KSBf90                         52.3
5     PG      gfit                           32.0
6     MV      KKshapelets with flexion       28.6
7     KK      KKshapelets                    23.0
8     HHS3    GaussStackForwardGaussCleaned  22.4
9     SB      im2shape                       20.1
10    HHS2    GaussStackForwardGauss         19.9
11    HHS1    Gauss                          12.8
12    MJ      BJ02 deconvolved shapelets     9.80
13    USQM    USQM                           1.22

Table 5. RealNoise_Blind leaderboard at the close of the
challenge.



5.2 RealNoise Blind Results 

The main challenge consisted of 27 million galaxies with 
roughly a factor of 10 more noise per pixel, corresponding 
to the type of image that we will ultimately want to use 
for cosmic shear. The RealNoise_Blind leaderboard at the
close of the challenge is shown in Table 5. The winner of the
GREAT08 Challenge is clearly the 'CVN Fourier' method
by R. Hosseini and M. Bethge, HB. This method was
inspired by the second-place AL method, but improves on a
key limitation highlighted by Lewis (2009), in that
it does not depend on the galaxy centroid.

Fig. 5 shows Q as a function of galaxy type, PSF type,
SNR, and galaxy size for RealNoise_Blind. The central,
fiducial, value is the same on each of the four panels. Each point
on the panels corresponds to a single set of conditions; for
example, for the SNR = 10 point, all other parameters are
set at the fiducial value.

HB performs consistently well through all branches of 
the simulation, with significantly improved performance on 
the "b+d offcenter" galaxies. AL actually outperformed HB 
on six of the nine simulation branches, and obtains a Q value 
a factor of almost 4 larger than any other method for the 
fiducial simulation set, which is close to our target value of 
1000. AL was second overall mostly as a result of a poor 
performance on the low SNR branch, and to a lesser extent 
on the "Fid ex 2" PSF. It would be interesting to see if
the results could be improved in either of these regimes,
for example with better centroiding at low SNR or better
modeling of the "Fid ex 2" PSF.

TK uses a model with coaligned exponential and de 
Vaucouleurs components which explains why the results on 
'b or d' are so good. It also does well on 'b+d offcenter'. If 
the galaxy model could be extended then this may improve 
the other results, which all use the fiducial galaxy type. KK 
also performs well on the "b or d" branch, and to a lesser 
extent, so does SB. Both these methods also assume galaxies 
have elliptical isophotes, which matches exactly the model 
in the simulation. 

The best method at the high SNR end of RealNoise_Blind
is MV (KK shapelets with flexion), which also
performs well for the larger galaxies. HHS1 on the larger
galaxy branch is the only method on any branch to achieve
greater than the Q ~ 1000 level required for future
precision surveys. This trend is surprising given that it reverses
the trend with Rgp/Rp seen in LowNoise_Blind. HHS1 also
obtains a good Q value at the high SNR end (SNR = 40) of
RealNoise_Blind, which is not surprising given its strong
performance in LowNoise_Blind (SNR = 200).

Note that the absolute value of Q will depend on the
noise on the shear measurements and on the number of
realisations over which the average is performed. It is therefore
not very meaningful to compare Q values between
LowNoise and RealNoise; the m and c values, however, can
be usefully compared. These values are discussed for
RealNoise_Blind in Appendix C2.



6 DISCUSSION 

The GREAT08 Challenge has moved shear measurement
research significantly beyond STEP1 and STEP2. We
recognised that the shear measurement problem is intrinsically a
statistical, not astronomical, problem and wrote a
description aimed at non-astronomers (the GREAT08
Handbook). At the launch of the challenge we had achieved the
following:

• We moved from end-to-end simulations to simpler simula- 
tions which isolate a key difficult part of the shear measure- 
ment problem without confusion from other effects. 

• The simulations focus in on key areas of simulation param- 
eter space and allow a detailed assessment of the success of 
different methods in the various regimes explored. 

• We used a larger suite of simulations to assess methods at
a much higher level of precision than was possible in STEP1
and STEP2; this level of precision is appropriate for the
most ambitious planned cosmic shear surveys.

• The GREAT08 Team was formed from the original
STEP Team, and new groups (e.g. LensFit) were incorporated
and assessed as part of the blind competition.

• We formulated a new figure of merit with which to assess 
the results of the challenge and provided active leaderboards 
during the challenge. 

• The GREAT08 Team codes were all made publicly
available at the launch of the challenge.

In addition to the six GREAT08 Team entries on the
leaderboards at the start of the challenge there were five new
entries, from groups which included computer scientists and
non-lensers. The GREAT08 Challenge has therefore achieved its main
goal of reaching out beyond the existing shear measurement
community.

The GREAT08 Challenge prize for the highest Q value
in RealNoise_Blind went to Reshad Hosseini and Matthias
Bethge (HB). The GREAT08 Team also awarded a prize for
a significant contribution to advancing shear measurement
methods to Antony Lewis (AL), specifically for superb
results over a significant range of simulation branches, and a
timely summary of the problem that highlighted important
issues (Lewis 2009). Neither of these prizewinning groups
is associated with existing lensing groups.

The shear measurement problem has been invigorated 
by the Challenge and by the new ideas brought in. The most 
important new ideas are 

• a consideration of the impact of the assumed galaxy model 
on the accuracy of shear measurements; 

• a reconsideration of the stage in the measurement process 
at which to average observational quantities. 

The assumed galaxy model has recently been shown to
be important in causing biases in shear measurement (Lewis
2009; Voigt & Bridle 2009; Melchior et al. 2009). The
existence of this bias was first pointed out by Lewis (2009) and
this was the motivation for using a "stacking" method by
AL, HB and HHS2/3. In these methods the individual
galaxy properties are averaged away before a model is
fitted, by averaging together simple statistics of the galaxy
images. AL pointed out that averaging together the images
themselves is not fully independent of the galaxy model, the
PSF or the shear because a centroid must be estimated
before stacking. HB solved this by instead stacking two-point
statistics of the image (specifically the power spectrum),
which is insensitive to the centroid. This raises the general
question of what quantity should be averaged (or otherwise
combined), and at what stage, when presented with many
galaxy images all with the same shear value.

The success of the stacking methods on images with
constant galaxy properties leads to questions about how well
stacking could work on more realistic data. Because shear
varies with position in real data, the stacking process will
average the shear signal as well as nullify the observational
effects it was designed to remove. However, we speculate
that the average shear in a patch of sky is still a useful
cosmological quantity, as has sometimes been considered (e.g.
most recently the top hat shear variance statistic shown in
Fig. 5 of Fu et al. 2008; see also the cosmic shear ring
statistics described in Schneider & Kilbinger 2007; Eifler et al.
2009). For lensing analyses of clusters or galaxies, the
assumption of axisymmetry is often made, which lends itself
naturally to stacking in annuli about the center of the
cluster. It would also be necessary to determine how to properly
stack galaxies with a range of SNR or PSF in a given patch
of sky, and especially how to tackle galaxies with a range
of redshifts, and thus a range of shears. For example, 3D
lensing (Heavens 2003; Kitching et al. 2008) is specifically
designed to take into account the probability distributions
in redshift and shear for each galaxy separately.

The results of GREAT08 show that different methods
are successful in different corners of parameter space and
many results are close to the target Q value of 1000. The
results from different simulation branches give clues as to
where methods could be improved and we expect to see
further work on developing the methods. The winning method
HB only finished its first run two days before the challenge
deadline and therefore it could be optimised further. In
addition it shows remarkably stable performance as a function of
SNR, implying that the good Q results might continue down
to even lower SNR values. On the fiducial simulations AL
achieved a Q value nearly four times higher than previous
work, marking a significant improvement. The performance
at low SNR is the clear next area for investigation for this
method. TK obtains good results, in particular when the
underlying model was similar to the model in the simulation.

GREAT08 marks the first in a series of GREAT
challenges, which are intended to be a roadmap of
simulations leading up to the real grand observational challenges
that the community will face with the next generation
of cosmic shear surveys. The next challenge in the series
will be GREAT10. This will represent the next step
towards creating fully realistic simulations. Many aspects of
the GREAT10 simulation will be familiar from GREAT08,
though they will differ in some key aspects. The most
significant change will be spatial variation: both the shear
and PSF will vary across each image. GREAT10 will also
invite people to solve an extra cosmic shear challenge,
estimating the convolution kernel from images to
sufficient accuracy. For more information on GREAT10 visit
http://www.great10challenge.info.

APPENDIX A: DETAILS OF THE IMAGE
SIMULATIONS

A1 PSF models

In an attempt to isolate problems in the shear estimation 
pipelines and make the challenge more accessible we pro- 
vided maximal information about the PSFs used during the 
competition. 

The PSFs had a truncated Moffat profile 

$$ I_p(r) = \begin{cases} \left[1 + (r/r_d)^2\right]^{-\beta} & r < r_c \\ 0 & r \geq r_c \end{cases} \qquad \mathrm{(A1)} $$



where we set β = 3.5. This profile is motivated by the
combination of diffraction limited optics with random Gaussian
blurring by the atmosphere and is therefore reasonably
representative of PSFs for ground-based telescopes. The scale
radius r_d was determined by setting the Full Width at Half
Maximum (FWHM) to 2.85 pixels, and r_c was set to twice the
FWHM. Three different PSFs were used in the GREAT08
Challenge, each with a different ellipticity, as shown in
Table A1.
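
As a worked illustration of how the scale radius follows from the FWHM, the sketch below assumes the standard Moffat form [1 + (r/r_d)^2]^(-β), truncated at r_c. The symbol r_d and the function names are labels chosen here for illustration; only β = 3.5, FWHM = 2.85 pixels and r_c = 2 FWHM come from the text.

```python
import math


def moffat_scale_radius(fwhm, beta):
    """Scale radius r_d such that the profile falls to 1/2 at r = FWHM/2.

    Solves [1 + (FWHM / (2 r_d))^2]^(-beta) = 1/2 for r_d.
    """
    return fwhm / (2.0 * math.sqrt(2.0 ** (1.0 / beta) - 1.0))


def truncated_moffat(r, r_d, beta, r_c):
    """Moffat profile (unit central intensity), set to zero for r >= r_c."""
    if r >= r_c:
        return 0.0
    return (1.0 + (r / r_d) ** 2) ** (-beta)


FWHM = 2.85          # pixels, from the text
BETA = 3.5           # from the text
r_d = moffat_scale_radius(FWHM, BETA)
r_c = 2.0 * FWHM     # truncation radius, from the text
```

Evaluating `truncated_moffat(FWHM / 2, r_d, BETA, r_c)` returns one half of the central value, which is a quick consistency check on the conversion.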

Star catalogues consisted simply of the position of the 
point source. The x positions were drawn from a Gaussian of 
standard deviation 1.2 pixels centered on the middle of the 
postage stamp, similarly for the y positions. The star cata- 
logues were provided at the time of the challenge. The con- 
volution kernel and image generation method are described 
below. 



A2 Galaxy catalogue generation 

The information provided in this appendix subsection was 
not available during the Challenge. 

In general, the galaxies in GREAT08 are the sum of
two components, each with a Sersic (Sersic 1968) intensity
profile







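$$ I(r) = \begin{cases} I_0 \exp\left[-k\,(r/r_e)^{1/n}\right] & r < 4 r_e \\ 0 & r \geq 4 r_e \end{cases} \qquad \mathrm{(A2)} $$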



where I(r) is the amount of light per unit area at a radius
r, and k ≈ 2n - 0.331 (see e.g. Peng et al. 2002). The scale
radius r_e and the total intensity (which determines I_0) are
free parameters specified in the catalogues. The first
component, with n = 4, is an approximation to the central bulge
component of galaxies, corresponding to a de Vaucouleurs
profile. The second component, with n = 1, is an
approximation to the exponential disk component of galaxies.
Circular galaxy images are made according to the profile I(r)
described above and then distorted according to the galaxy
ellipticity and shear as described below.

The x and y positions of the bulge component were
each drawn from a Gaussian of standard deviation 1.2 pixels 
centered on the middle of the postage stamp. By default 
the positions of the disk component were set equal to those 
of the bulge, except in one branch of the RealNoise_Blind 
simulations, as described below (see Table 2).

For each object, the total flux (integral of I(r) over
the postage stamp) in the disk component, as a fraction of
the total flux in both components, is in general a random
number drawn from a uniform distribution between 0 and 1.
However, for LowNoise_Known, RealNoise_Known, and one
branch of RealNoise_Blind, this fraction was set to either
0 or 1. So, in these simulations, the galaxies had either a pure
de Vaucouleurs or pure exponential profile.

The scale radii r_e of each component were set by
considering high resolution circular galaxy images after
convolution with the appropriate PSF. For single-component models
(i.e. when the bulge to total flux ratio is zero or unity), r_e is set
such that the convolved image has a FWHM of 1.4 times
that of the PSF, F_gp = 1.4 F_p, in the fiducial branch. Values of
1.22 or 1.6 were used for some other branches to explore
the effect of galaxy size, as detailed below (see Tables 1
and 2). The resulting r_e values for single-component models
are provided in Table A2. For two-component models the
disk scale radius is a set multiple of the bulge scale radius,
r_{e,d} = 2 r_{e,b} x r_{e,d0}/r_{e,b0}, using values from Table A2. The
bulge scale radius was set by simulating a high resolution
two-component circular model with the required bulge to
total flux ratio and finding the value such that the FWHM had
the required value (by default 1.4 times the PSF FWHM).

Table A2. Galaxy scale radius values for single-component
galaxy models. The left hand column gives the ratio of PSF
convolved galaxy FWHM to the PSF FWHM. The middle column
gives the scale radius for a single component disk model. The
right hand column gives the scale radius for a single component
bulge model. These values are interpolated to produce scale
radius values for two-component models, as described in the text.

F_gp/F_p  Disk r_{e,d0}  Bulge r_{e,b0}
1.22      0.82           1.59
1.4       1.3            3.8
1.6       2.4            18.0

The ellipticities of the bulge and disk were drawn from 



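$$ P(e) = e \left[\cos\left(\frac{\pi e}{2}\right)\right]^2 \exp\left[-\left(\frac{2e}{B}\right)^C\right] \qquad \mathrm{(A3)} $$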



with B = 0.05, C = 0.58 for the bulge and B = 0.19,
C = 0.58 for the disk; e = (a^2 - b^2)/(a^2 + b^2) where a
and b are the major and minor axes respectively. Since
ellipticities close to unity become unphysical, we truncate the
distribution at e = 0.9 and set all objects with e > 0.9 to
have e = 0.9. This distribution was loosely motivated by
results from the APM survey (Crittenden et al. 2001). The
bulge and disk ellipticities are drawn independently from the
above distributions and are thus uncorrelated. The angle
between the bulge major axis and the positive x axis is drawn
from a uniform distribution between 0 and 180 degrees. The
disk angle is equal to the bulge angle but perturbed by a
Gaussian of standard deviation 20 degrees.

Five thousand galaxy parameters were simulated per
image set by drawing from the above distributions. To
minimise noise, the parameters were all rotated by 90 degrees
to produce the remaining 5000 galaxy parameters (i.e. all
angles are increased by 90 degrees, x positions become y
positions, and y positions become negative x positions). The
list was randomised to hide the pairings. This paired
rotation was introduced in STEP2 to reduce shape noise. In the
absence of a PSF or shear the shear estimates from each 
galaxy in a pair are expected to cancel, thus removing noise 
arising from the intrinsic ellipticities of galaxies. 
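
The cancellation within rotated pairs can be checked numerically. The sketch below writes each ellipticity as the complex number e exp(2iθ), a common convention that is assumed here, so that a 90 degree rotation flips its sign. The uniform draw of e is only a placeholder and is not the distribution of Equation A3; all names are illustrative.

```python
import cmath
import math
import random


def complex_ellipticity(e_mag, angle_deg):
    """Spin-2 ellipticity e * exp(2 i theta) for position angle theta."""
    return e_mag * cmath.exp(2j * math.radians(angle_deg))


def paired_catalogue(n_base, rng):
    """Draw n_base (e, angle) pairs and append their 90-degree rotations."""
    base = [(rng.uniform(0.0, 0.9), rng.uniform(0.0, 180.0))
            for _ in range(n_base)]
    rotated = [(e, (angle + 90.0) % 180.0) for e, angle in base]
    catalogue = base + rotated
    rng.shuffle(catalogue)          # hide the pairings
    return catalogue


rng = random.Random(0)
catalogue = paired_catalogue(5000, rng)
mean_e = sum(complex_ellipticity(e, a) for e, a in catalogue) / len(catalogue)
```

Here `mean_e` vanishes to floating-point precision, whereas the mean over the 5000 unpaired draws alone would scatter at roughly the per-galaxy rms divided by the square root of 5000.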

Signal-to-Noise Ratios (SNR) are assigned in the cata- 
logues and are used during image simulation to set the flux 
in the galaxy image. For LowNoise images the value is 200, 
and for RealNoise images the default value is 20, with vari- 
ations to 10 and 40 within RealNoise_Blind. The definition 
of this number in terms of the noise model is described in 
the following subsection. 

For LowNoise_Known and RealNoise_Known the galaxies
all have just a single component and, within each set, each
galaxy is assigned a de Vaucouleurs or an exponential profile
at random. The galaxies in LowNoise_Blind all have a bulge
plus disk two-component model as described in the text
above. The majority of the galaxies in RealNoise_Blind have
the same two-component model as in LowNoise_Blind. One
of the nine RealNoise_Blind branches has single-component
galaxies as in the Known simulations. The two-component
models all share the same centroid for the bulge and disk,
except for one of the nine RealNoise_Blind branches, in which
the bulge is off-centered from the disk by a Gaussian of
standard deviation 0.3 pixels.

The true shears for LowNoise_Known and
RealNoise_Known were provided throughout the
challenge. They are Gaussian distributed with a
standard deviation of 0.03 in each of g_1 and g_2, and
zero mean. The true shears for LowNoise_Blind and
RealNoise_Blind have now been released, and are
illustrated in Fig. A1. These shears are perturbations around
the root values g_1 = (-1, 0, 1, 0, -1/√2) x 0.037 and
g_2 = (0, 0, 0, 1, -1/√2) x 0.037 and thus do not have zero
mean. This distribution is chosen instead of a Gaussian to
improve the uncertainties on linear fits to the output versus
true shear. For LowNoise_Blind, one position in shear space
is drawn from around each root and there is one set with
this shear. For RealNoise_Blind, 50 positions in shear space
are drawn from around each root and there are 6 sets with
each shear, as illustrated in Fig. 2.

A3 Image simulations 

The galaxy images are created according to the forward pro- 
cess using a Monte Carlo simulation technique. The general 
idea is that the intensity of a pixel in the image of a galaxy 
is directly proportional to the number of photons falling 
into that pixel. The photon count at each point depends 
on the intensity distribution (the light profile) of the galaxy. 
Therefore, if we draw random samples (photons) from the 
theoretical light profile function and then count the number
of photons falling in each pixel, we obtain the image
of a galaxy with the required light profile. The circular light
profile thus obtained is then reshaped by applying the nec- 
essary transformations to the coordinates of the photons. 
Since the point-spread function (PSF) can be considered as 
a probability distribution, a similar method can be used to 
simulate it. The light profile of the galaxy is convolved with 
the PSF and finally pixelized into a FITS image. 

In general, any Monte Carlo technique can be used for
the simulation of the light profile. We use inverse transform
sampling for this purpose. It is conceptually simple and
generally applicable for sampling from a one-dimensional
probability distribution. The basic principle is that, given a
continuous random variable U distributed uniformly in [0, 1]
and a random variable X with cumulative distribution F,
then X = F^{-1}(U) has distribution F. In other words, to
sample from X, we generate a random sample U and find
the value of X at which the cumulative distribution is equal
to U.
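
To make the principle concrete, the sketch below applies inverse transform sampling to a distribution whose inverse is known in closed form. The exponential distribution is chosen purely for illustration and plays no role in the GREAT08 simulations; its cumulative distribution is F(x) = 1 - exp(-x), so F^{-1}(u) = -ln(1 - u).

```python
import math
import random


def sample_exponential(rng):
    """Draw X = F^{-1}(U) for the unit exponential distribution."""
    u = rng.random()                 # U uniform in [0, 1)
    return -math.log(1.0 - u)        # invert F(x) = 1 - exp(-x)


rng = random.Random(1)
samples = [sample_exponential(rng) for _ in range(100000)]
sample_mean = sum(samples) / len(samples)
```

The sample mean should approach the true mean of 1 as the number of draws grows, which is a simple check that the inversion has been coded correctly.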

In order to simulate the photons distributed by a Sersic
law, we need to find the cumulative distribution of the
density given by Equation A2. Taking r_e = 1 and substituting
R = k r^{1/n}, we obtain the cumulative distribution as

$$ F(R) = \frac{\gamma(2n, R)}{\Gamma(2n)}, \qquad \mathrm{(A4)} $$

where n is the Sersic index and γ(a, x) is the (lower) incomplete
Gamma function. The inverse of the distribution can be
approximately calculated by using linear interpolation, given
that we have an ordered set of values of {R, F(R)} for the
range of R (e.g., from 0 to 20).
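
The tabulate-and-interpolate step can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the simulation code: it only handles integer 2n (which covers n = 1 and n = 4), for which the regularised incomplete Gamma function has the closed form 1 - exp(-R) Σ_{j<2n} R^j/j!; the table size, the function names and the uniform draw of the azimuthal angle are choices made here.

```python
import bisect
import math
import random


def sersic_cdf(R, n):
    """Regularised lower incomplete Gamma function P(2n, R) for integer 2n."""
    m = int(round(2 * n))
    if abs(m - 2 * n) > 1e-9:
        raise ValueError("this sketch only handles integer 2n")
    term = total = 1.0
    for j in range(1, m):
        term *= R / j
        total += term
    return max(0.0, 1.0 - math.exp(-R) * total)


def build_table(n, R_max=20.0, steps=2000):
    """Ordered set of {R, F(R)} values used for the inversion."""
    Rs = [R_max * i / steps for i in range(steps + 1)]
    Fs = [sersic_cdf(R, n) for R in Rs]
    return Rs, Fs


def sample_photon(table, n, k, r_e, rng):
    """Draw one photon position from a circular Sersic profile."""
    Rs, Fs = table
    u = rng.random() * Fs[-1]        # restrict U to the tabulated range
    i = min(max(bisect.bisect_right(Fs, u), 1), len(Fs) - 1)
    span = Fs[i] - Fs[i - 1]
    t = (u - Fs[i - 1]) / span if span > 0.0 else 0.0
    t = min(max(t, 0.0), 1.0)
    R = Rs[i - 1] + t * (Rs[i] - Rs[i - 1])   # linear interpolation
    r = r_e * (R / k) ** n                    # invert R = k (r / r_e)^(1/n)
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return r * math.cos(theta), r * math.sin(theta)
```

For an exponential disk, for example, one would call `build_table(1)` once and then `sample_photon(table, 1, 2 * 1 - 0.331, r_e, rng)` for each photon.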

Figure A1. True shears for LowNoise_Blind and RealNoise_Blind, color coded for the different branches of the simulations.

The circular light profile of the galaxy obtained by the
method above is made larger, elliptical and rotated
according to the values of scale radius r_e, axis ratio q and angle
φ respectively. These operations can be represented in the
form of matrices as

$$ r_e \begin{pmatrix} 1/\sqrt{q} & 0 \\ 0 & \sqrt{q} \end{pmatrix} \qquad \mathrm{(A5)} $$

$$ \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix} \qquad \mathrm{(A6)} $$


The shear from the gravitational lensing is applied next.
This operation can be written as

$$ \begin{pmatrix} 1 + g_1 & g_2 \\ g_2 & 1 - g_1 \end{pmatrix} \qquad \mathrm{(A7)} $$



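For computational simplicity, we combine all of the above
operations into a single matrix given by

$$ \begin{pmatrix} r_e\left[(1+g_1)c - g_2 s\right]/\sqrt{q} & r_e\left[g_2 c - (1-g_1)s\right]/\sqrt{q} \\ r_e\sqrt{q}\left[(1+g_1)s + g_2 c\right] & r_e\sqrt{q}\left[g_2 s + (1-g_1)c\right] \end{pmatrix} \qquad \mathrm{(A8)} $$

where c = cos φ and s = sin φ.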
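
As a consistency check on the combined matrix, the sketch below builds it two ways: directly from the closed-form entries, and as the product of a scaling matrix, a rotation matrix and a shear matrix. The form of the scaling matrix, r_e diag(1/√q, √q), the ordering of the product and the row-vector convention for photon coordinates are assumptions inferred from the combined expression; the function names are illustrative.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combined_matrix(r_e, q, phi, g1, g2):
    """Closed-form single matrix (Equation A8)."""
    c, s = math.cos(phi), math.sin(phi)
    rq = math.sqrt(q)
    return [[r_e * ((1 + g1) * c - g2 * s) / rq,
             r_e * (g2 * c - (1 - g1) * s) / rq],
            [r_e * rq * ((1 + g1) * s + g2 * c),
             r_e * rq * (g2 * s + (1 - g1) * c)]]


def stepwise_matrix(r_e, q, phi, g1, g2):
    """Scaling, then rotation, then shear, multiplied in that order."""
    c, s = math.cos(phi), math.sin(phi)
    rq = math.sqrt(q)
    scale = [[r_e / rq, 0.0], [0.0, r_e * rq]]     # assumed form
    rotate = [[c, -s], [s, c]]
    shear = [[1 + g1, g2], [g2, 1 - g1]]
    return matmul(matmul(scale, rotate), shear)


def transform_photon(xy, m):
    """Apply m to a photon position treated as a row vector (assumed)."""
    x, y = xy
    return (x * m[0][0] + y * m[1][0], x * m[0][1] + y * m[1][1])
```

The two constructions agree entry by entry for any r_e, q, φ, g_1 and g_2, which is what allows the simulation to apply a single matrix per galaxy.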

Having obtained the light profile of the galaxy, we move
on to create a Moffat PSF and convolve it with the galaxy.
Using a similar procedure to that described above for the
Sersic profile, we can simulate the Moffat PSF given by
Equation A1. Each sample from the PSF corresponds to the
displacement of the photon when convolved with the galaxy.
The circular PSF can be scaled to the required FWHM
and made elliptical by applying the transformation



$$ r_p \begin{pmatrix} 1 - e_1 & -e_2 \\ -e_2 & 1 + e_1 \end{pmatrix} \qquad \mathrm{(A9)} $$



Assuming that the number of samples in the light profile 
and the PSF are the same, the convolution of the image is 
accomplished by adding the positions of the galaxy and PSF 
photons. The image is pixelized by counting the number of 
photons falling into each pixel of the postage stamp and then 
it is normalized. 
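
The photon-level convolution and pixelization can be sketched as below. Gaussian draws stand in for the Sersic and Moffat samplers so that the example stays self-contained; the stamp centring convention, the handling of photons that fall outside the stamp, and all names are assumptions made for illustration.

```python
import math
import random


def convolve_photons(galaxy_photons, psf_offsets):
    """Convolve by adding one PSF displacement to each galaxy photon."""
    return [(gx + px, gy + py)
            for (gx, gy), (px, py) in zip(galaxy_photons, psf_offsets)]


def pixelize(photons, size=39):
    """Count photons per pixel on a size x size stamp and normalise to unit sum.

    Positions are measured from the stamp centre; photons falling outside
    the stamp are dropped.
    """
    image = [[0.0] * size for _ in range(size)]
    half = size / 2.0
    for x, y in photons:
        col = int(math.floor(x + half))
        row = int(math.floor(y + half))
        if 0 <= col < size and 0 <= row < size:
            image[row][col] += 1.0
    total = sum(sum(r) for r in image)
    if total > 0.0:
        image = [[v / total for v in r] for r in image]
    return image


rng = random.Random(3)
n_photons = 5000
galaxy = [(rng.gauss(0.0, 2.0), rng.gauss(0.0, 2.0)) for _ in range(n_photons)]
psf = [(rng.gauss(0.0, 1.5), rng.gauss(0.0, 1.5)) for _ in range(n_photons)]
image = pixelize(convolve_photons(galaxy, psf))
```

Because the sum of two independent random displacements is distributed as the convolution of their densities, no explicit convolution integral is needed.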

The galaxy images in GREAT08 contain two different
light profiles. The final image is created by adding together
two images with different light profiles. If I_1 and I_2
represent two galaxy images with different light profiles, the final
image I_final is created by the equation

$$ I_{\rm final} = m I_1 + (1 - m) I_2, \qquad \mathrm{(A10)} $$

where 0 ≤ m ≤ 1 is a multiplication factor. Poisson noise is
then added to each pixel according to the SNR.

CCD detectors on ground-based telescopes collect a fi- 
nite number of photons from both astrophysical objects and 
atmospheric emission. We therefore mimic this effect by 
adding the background level B = 1 x 10^ to each pixel, and
drawing a number from a Poisson distribution with a mean 
equal to the total number (background plus galaxy) in each 
pixel. For numerical convenience we then subtract B from 
each pixel. For the RealNoise simulations, this background 
is much larger than the contribution from the galaxy, so 
this process is closely approximated by adding a Gaussian 
random number of standard deviation √B with zero mean.

Before the noise model is applied, the total flux in the
galaxy is set using the SNR given in the catalogue, and the
background level discussed above. Details are given in the
appendix, but in summary we define SNR as the flux divided
by the uncertainty in the flux obtained if the true shape (but
not normalisation) of the object is known.

For the purpose of the SNR calculations we approximate
the Poisson noise as a Gaussian of standard deviation √B
for both LowNoise and RealNoise simulations. We follow the
definition

$$ \mathrm{SNR} = \frac{F}{\sigma_F} \qquad \mathrm{(A11)} $$

where the flux F is the sum of the galaxy counts in each
pixel I_i

$$ F = \sum_i I_i \qquad \mathrm{(A12)} $$

and σ_F is the uncertainty in the flux. In general the
uncertainty in the flux depends on the assumptions used to
measure it. We make the assumption that the true galaxy
shape (profile of counts in all the pixels) is known precisely
up to an overall unknown scaling which is proportional to
the flux. By considering a χ^2 fit it can then be shown that

$$ \sigma_F = \frac{\sqrt{B}\,F}{\sqrt{\sum_i I_i^2}} \qquad \mathrm{(A13)} $$

and therefore the flux can be set such that

$$ \sqrt{\sum_i I_i^2} = \mathrm{SNR}\,\sqrt{B}. \qquad \mathrm{(A14)} $$
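
One way to read the flux normalisation is sketched below. Under the stated assumptions (Gaussian noise of variance B in every pixel and a known profile shape with a free amplitude), a least-squares fit for the amplitude gives σ_F = √B F / sqrt(Σ I_i^2), so that SNR = sqrt(Σ I_i^2)/√B. The function names and the background value used in the usage note are hypothetical placeholders, not values taken from the simulations.

```python
import math


def snr_of(image, background):
    """SNR = F / sigma_F for a known profile shape and Gaussian noise."""
    flux = sum(v for row in image for v in row)
    norm = math.sqrt(sum(v * v for row in image for v in row))
    sigma_f = math.sqrt(background) * flux / norm
    return flux / sigma_f


def scale_to_snr(profile, snr, background):
    """Rescale a noiseless profile so that it has the requested SNR."""
    norm = math.sqrt(sum(v * v for row in profile for v in row))
    factor = snr * math.sqrt(background) / norm
    return [[factor * v for v in row] for row in profile]
```

For example, `scale_to_snr(profile, 20.0, 1.0e6)` rescales a noiseless stamp to SNR = 20 for an illustrative background of 10^6 counts, and `snr_of` applied to the result recovers 20.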

We note that the images produced using the above
ellipticities and values give some very elliptical images that
extend beyond the 39 x 39 postage stamp.



APPENDIX B: ADDITIONAL INFORMATION 
ON METHODS 

At the launch of the challenge the GREAT08 Team had 
put six results on the leaderboard, accompanied by a code 
wiki http://great08challenge.pbworks.com summarising 
the codes used and linking to downloadable versions of the 
code that was used on the GREAT08 simulations. Over the 
course of the challenge this wiki was updated by external 
GREAT08 participants, several of whom also provided their 
codes. The key elements of this code wiki are captured in 
Table B1.

APPENDIX C: DETAILED ANALYSIS OF 
RESULTS 

C1 LowNoise_Blind

The overall performance, as measured by Q, has
contributions from various competing effects. We break these up into
a multiplicative bias m, an additive bias c and an rms
dispersion σ, as defined in Section 3. For each of the three
simulation branches in LowNoise_Blind we fit a straight line
to a plot of submitted g_1 versus true g_1 values and identify
the slope as (m_1 + 1) and c_1 as the offset. We repeat for
g_2 and average the multiplicative biases together to obtain
an overall value m, and similarly for the additive bias c. The
scatter σ is given by the standard deviation of the residuals.
Note that although the 90 degree rotations in GREAT08
substantially reduce the effect of shape noise, this would be
a large additional contribution to the statistical uncertainty
from realistic data, as it roughly adds in quadrature with
the statistical scatter (at the level of about 0.2 per galaxy).
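
The straight-line fit described above can be sketched with an ordinary least-squares fit for one shear component. This is a minimal illustration with hypothetical names, not the analysis code used for the leaderboards; in particular it takes the scatter as the rms of the residuals with no degrees-of-freedom correction.

```python
import math


def fit_bias(g_true, g_submitted):
    """Least-squares fit of g_submitted = (1 + m) * g_true + c.

    Returns (m, c, sigma) where sigma is the rms of the residuals.
    """
    n = len(g_true)
    mean_x = sum(g_true) / n
    mean_y = sum(g_submitted) / n
    sxx = sum((x - mean_x) ** 2 for x in g_true)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(g_true, g_submitted))
    slope = sxy / sxx
    c = mean_y - slope * mean_x
    residuals = [y - (slope * x + c) for x, y in zip(g_true, g_submitted)]
    sigma = math.sqrt(sum(r * r for r in residuals) / n)
    return slope - 1.0, c, sigma


def average_bias(fit_1, fit_2):
    """Average m and c over the two shear components."""
    return (0.5 * (fit_1[0] + fit_2[0]), 0.5 * (fit_1[1] + fit_2[1]))
```

Calling `fit_bias` once for g_1 and once for g_2, then `average_bias` on the two results, gives the overall m and c for a branch.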

The finite number of simulations means that these
values cannot be determined exactly. Therefore we also
estimate uncertainties on the fitted multiplicative and additive
biases from the submitted shear values. The uncertainty on
m depends on the shear measurement method used and on
the simulation properties. We calculate the uncertainty on
the estimated m_i by calculating the likelihood as a
function of m_i and c_i and marginalising over c_i. We then
calculate an average uncertainty on m over shear components i.
Uncertainty decreased with increasing galaxy size for most
methods, and the winning method HHS1 had one of the
smaller uncertainties on m, decreasing from 5 x 10"'' at 
Rgp/Rp = 1.22 to 2.3 X 10"^ at Rgp/Rp = 1.6. This may 
be compared to the multiplicative bias values m obtained 
by different groups, and we see that the uncertainty is small 
compared to at least one of the values obtained by each 
group and therefore is not the limiting factor in interpreting 
these results. 
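
As an illustration of the marginalisation step, the sketch below assumes a Gaussian likelihood with equal, known scatter on every submitted shear and flat priors on the slope and offset. The form of the likelihood is not specified in the text, so this is an assumption, and the function name is hypothetical. Under these assumptions the marginalised uncertainties have the standard closed form for a straight-line fit.

```python
import math


def bias_uncertainties(g_true, sigma):
    """Marginalised 1-sigma uncertainties on the slope (m) and offset (c).

    Assumes Gaussian errors of equal size sigma on each submitted shear
    and flat priors on both parameters (illustrative assumptions).
    """
    n = len(g_true)
    mean = sum(g_true) / n
    sxx = sum((g - mean) ** 2 for g in g_true)
    sum_sq = sum(g * g for g in g_true)
    sigma_m = sigma / math.sqrt(sxx)                    # marginalised over c
    sigma_c = sigma * math.sqrt(sum_sq / (n * sxx))     # marginalised over m
    return sigma_m, sigma_c
```

With five illustrative true shear values spread between -0.04 and 0.04 and a scatter of 1e-4 per point, this gives an uncertainty on m of about 1.6e-3 and on c of about 4.5e-5. These input values are chosen only for the example and are not the GREAT08 shear values.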

The uncertainties on the additive biases $c_1$ and $c_2$ also 
decrease with increasing galaxy size, as expected. At a given 
galaxy size they range over almost an order of magnitude for 
the different methods. A typically low uncertainty was obtained 
by HHS1 across the range of galaxy sizes, and it varies 
from 10"" at Rgp/Rp = 1.22 to 3 x 10"^ at Rgp/Rp = 1.6. 
Again, this is much smaller than the additive shear biases 
seen by all groups for at least one galaxy size and is therefore 
not the limiting factor in obtaining small biases. 

Fig. C1 shows the multiplicative bias m and additive 
bias c as a function of Rgp/Rp for LowNoise_Blind. We 
now see that HHS1 performs less well at large galaxy sizes 
due to an increased multiplicative bias, indicating that the 
shears are overestimated for these galaxy sizes. In the Q plot 
(Fig. 2) the second highest method, AL (blue solid line), does 
best at the fiducial size and worse at larger and smaller sizes. 
On the more detailed figures of multiplicative and additive 
biases we see that the picture seems yet more curious, with 
good m and c values (close to zero) at small galaxy sizes, 
and becoming worse at large sizes. A more detailed analy- 
sis shows that the slight improvement at the fiducial galaxy 
size can be attributed to a partial cancelation between the 
effects of a negative m and a positive c. The third best result, 
PG, is relatively insensitive to the galaxy size; this effect is 
mirrored in the additive bias, which dominates the overall 
Q result since the multiplicative bias is relatively small. 

We see that the CH method acquired a very large posi- 
tive m at small Rgp/Rp, indicating a consistent ~ 12% over- 
estimation of the true shear when the galaxies are poorly 
resolved. TK has the best all-round performance on the fiducial 
model, which may be expected because LensFit was optimised 
to work well on the typical galaxies used for cosmic shear, 
and these tend to coincide with the fiducial model 
used for GREAT08. The fact that MJ, SB and USQM consis- 
tently underestimate the shear is the dominant contribution 
to their poor performance. MV and KK both underestimate 
the shear at small Rgp/Rp, but overestimate the shear at 
moderate and large Rgp/Rp. Note that the MV method is 
an extension of the KK method, and the two performed very 
similarly in all the LowNoise_Blind plots. Several methods 
(KK, MV, USQM, CH) had the largest additive biases c for 
poorly resolved galaxies, which may suggest that the infor- 
mation about the true PSF model was not fully incorporated 
into their analyses. 

As discussed in Section 3, a successful method needs 
to produce reasonably low noise shear measurements, which 
we quantify by the scatter $\sigma$, shown in Fig. C1. The scatter 
decreases as the galaxy size is increased, which is expected 
as information on the galaxy can be obtained from more image 
pixels. There is about an order of magnitude difference 
between the methods, with HHS1 having a consistently low 
scatter around $10^{-4}$. Since there are 10,000 galaxies in each 
FITS file this corresponds to an uncertainty on the shear of 
each individual galaxy of 0.01, which is typical for a SNR of 
200. For LowNoise_Known there is only a single FITS file for 
each simulation branch, which means that there is no sum 
over files j in Eq. 1 (i.e. $j = 1$). So, in the absence of other 
biases ($m = c = 0$) we would have $Q_i \sim 10^{-4}/\sigma_k^2$, where 
$\sigma_k$ is the scatter for a single simulation branch. Therefore 
$\sigma_k < 3 \times 10^{-4}$ is required to reach the target of $Q_i \sim 1000$ for 
a given simulation branch. Some methods have $\sigma_k \sim 10^{-3}$ 
at the smallest galaxy sizes, which will limit their overall Q 
to around 100. 
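
The arithmetic linking scatter to Q in the single-file case can be checked with a short sketch. It assumes an unbiased method (m = c = 0) and one file per simulation branch, as in the text; the function names are hypothetical.

```python
import math


def q_from_scatter(sigma_k):
    """Q for an unbiased method with a single file per branch: 1e-4 / sigma_k^2."""
    return 1e-4 / sigma_k ** 2


def max_scatter_for_target(q_target):
    """Largest scatter that still reaches q_target when m = c = 0."""
    return math.sqrt(1e-4 / q_target)
```

A scatter of 1e-3 gives Q of about 100, and the target of Q of about 1000 requires a scatter below about 3.2e-4, in line with the values quoted above.
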

Table B1. Table providing more details about the methods. This table collates information from the GREAT08 code wiki 
http://great08challenge.pbworks.com on the programming language used, an indicative time taken per galaxy, and associated URLs. 
These runtimes are only illustrative since they are reproduced as provided by the code authors and no attempt has been made to 
benchmark or compare the machines used. The TK method takes 0.01 seconds using 8 threads, and these numbers are multiplied to give 
the number in the table, for ease of comparison with other methods. 

Figure C1. Scatter, multiplicative and additive shear measurement bias as a function of galaxy size for LowNoise_Blind. 

C2 RealNoise_Blind 

Fig. C2 shows the output shear residuals versus the input 
shear for the top two methods, for the fiducial simulation 
branch. This figure illustrates how the multiplicative and 
additive errors are calculated. The total Q for a given sim- 
ulation branch is roughly a combination of the slopes and 
offsets of each best-fit line, and the scatter about the lines. 
The equivalent plot for LowNoise_Blind has only five points 
on it, and the circles are identical to the crosses. 

We show the multiplicative and additive biases in 
Figs. C3 and C4. The decreased SNR in RealNoise_Blind 
is compensated for by averaging over many shear values to 
reduce the noise and ensure that the quality measures Q, m 
and c can be dominated by systematic biases. 

We first consider overall trends in multiplicative and 
additive biases (Figs. C3 and C4). The "psftype" panels indicate 
that changes in the PSF had virtually no effect on 
m but quite a large effect on c. Incorrect estimation of the 
PSF size tends to cause a multiplicative bias, so given that 
the PSFs all had roughly the same size, and varied only in 
ellipticity, this result is not surprising. There is a general ten- 
dency for c to be best for the fiducial PSF, positive for the 
"PSF rot" and negative for "PSF ex 2". This tendency in- 
dicates that the participants made the most efforts to model 
the fiducial PSF, which is used for almost all of the simula- 
tions. It would be interesting to compare the observed trend 
with the result of wrongly assuming the fiducial PSF for the 
two other PSF branches in case this explains the result. 

The scatter of the submitted shears about the best fit 
line can be seen qualitatively by the range of the circles in 
Fig. C2 and quantitatively for each simulation branch in 
Fig. C3. Typical values around $10^{-3}$ are averaged down in 
the Q calculation by the average over the $j = 1, \ldots, 300$ simulations 
in a given simulation branch which have similar shear 
values. Therefore $Q_i \sim 300 \times 10^{-4}/\sigma_k^2$, and $\sigma_k$ should be less 
than about $5 \times 10^{-3}$ for all simulation branches so that 
a method with $m = c = 0$ is not prevented from reaching $Q \sim 1000$. This 
condition is met by most methods even at the lowest SNR 
value. 
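
The threshold on the scatter quoted above follows from a short calculation. The sketch below assumes that the files within a branch are independent, so that averaging over $N$ files reduces the variance of the mean residual by a factor $N$; the symbol $N$ is introduced here for illustration, with $N = 300$ taken from the text.

```latex
\begin{align}
Q &\sim \frac{10^{-4}}{\sigma_k^2 / N} = \frac{N \times 10^{-4}}{\sigma_k^2},
  \qquad N = 300, \\
Q \gtrsim 1000 &\;\Rightarrow\;
  \sigma_k \lesssim \sqrt{\frac{300 \times 10^{-4}}{1000}}
  \approx 5.5 \times 10^{-3}.
\end{align}
```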

The uncertainties on the multiplicative bias are close to 
constant with respect to galaxy and PSF type and decrease 
with increasing SNR and galaxy size. With the exception 
of USQM, there is little scatter between the groups for a 
given simulation branch (tens of percent difference), and the 
smallest uncertainties are obtained by AL and TK. For these 
methods, since the uncertainty on m is always less than 
10~^, we infer that the finite number of simulations is not 
the dominant reason that every submission departs from 
zero multiplicative bias for at least one simulation branch. 
The uncertainties on the additive bias are always less than 
$2 \times 10^{-4}$ for the best methods and therefore also do not 
dominate the biggest departures from perfection. 

For the method HB, the multiplicative calibration bias 
(upper panels, Fig. C3) is very close to constant with simulation 
branch. The shears are consistently overestimated by 
about 2 per cent. This bias is above our detailed simplis- 
tic requirements for far future experiments, but note that 
if a method really did have a multiplicative bias that was 
completely constant with the properties of the simulation or 
universe, then it would be trivially removed by dividing all 
shears by the relevant number. The additive calibration bias 
for this method is always below our detailed requirement of 
0.0003 for far future experiments, except for the "Fid ro- 
tated" PSF branch and the low SNR branch. It would be 
intriguing to know if this could be fixed further by more 
detailed modeling of the PSF. 
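
The remark above that a truly constant multiplicative bias could be divided out can be made concrete with a minimal sketch. It assumes that the "relevant number" is the fitted slope (1 + m), which follows from the definition of m used in this appendix; the function name is hypothetical.

```python
def remove_constant_bias(g_measured, m):
    """Divide out a known, constant multiplicative bias m (slope = 1 + m)."""
    return [g / (1.0 + m) for g in g_measured]
```

For a 2 per cent overestimate (m = 0.02) this recovers the input shears exactly. An additive offset c would not be removed by this rescaling.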

The poor performance of AL on the low SNR branch 
appears to come mostly from a multiplicative bias of nearly 
10% (Fig. 5). The results on the most elliptical PSF ("Fid 
ex 2") are also relatively disappointing, and come from the 
large additive calibration bias (Fig. 5). This result is consistent 
with a problem with modeling this particular PSF, 
in which residual PSF ellipticity remains to add to the true 
shear. 

The good results of MV at high SNR and large galaxy 
size are largely due to the reduction in multiplicative bias in 
these regimes. This result could possibly hint at inaccurate 
modeling of the PSF size. 

The poorer performance for smaller galaxy sizes for 
HHS now seems to come from both an increased multiplica- 
tive and additive error. The multiplicative bias increases 
slightly as a function of galaxy size in LowNoise_Blind but 
decreases as a function of galaxy size in RealNoise_Blind. 
Perhaps there is some kind of cancelation between the in- 
creasingly negative multiplicative bias as a function of SNR 
and the large positive multiplicative bias seen in 
RealNoise_Blind at smaller galaxy sizes. HHS2 and HHS3 used 
stacking to decrease dependence on the assumed galaxy 
model. The additive calibration bias is still significant and 
reduces the overall Q value. The sharp changes in additive 
calibration bias with PSF type suggest that the PSF is not 
being sufficiently well modelled. 

In STEP2 there was found to be a systematic difference 
between $m_1$ and $m_2$ that was attributed to the different 
effective pixel scales in the two directions. We have made 
separate figures for $m_1$ and $m_2$ but find them to be visually 
similar for most methods except CH. Galaxy type variations 
in general had little effect on m and c overall, a surprising 
result also found in STEP2. 

Figure C3. Upper panel: Scatter about a linear fit to output versus input shear. Lower panel: multiplicative shear measurement bias 
as a function of galaxy size for RealNoise_Blind. 

Figure C4. Additive shear measurement bias as a function of galaxy size for RealNoise_Blind. 






